Abstract We investigate exponential families of random graph distributions as a
framework for systematic quantification of structure in networks. In this paper we
restrict ourselves to undirected unlabeled graphs. For these graphs, the counts of
subgraphs with no more than k links are a sufficient statistic for the exponential families
of graphs with interactions between at most k links. In this framework we investigate
the dependencies between several observables commonly used to quantify structure in
networks, such as the degree distribution, cluster and assortativity coefficients.
1 Introduction

The notion of "complex networks" is usually used informally, intending
to suggest that these networks are not simple in some sense or another. Among the
simple networks one would include regular lattices on the one side and purely random
networks, i.e. Erdos-Renyi random graphs or Bernoulli graphs, on the other side. In
contrast, in the physics literature, two types of networks are considered as prototypes
of complex networks: scale-free graphs, i.e. graphs with degrees distributed according
to a power law, and small-world networks, i.e. graphs with a small diameter but a higher
cluster coefficient than a Bernoulli graph with the same diameter. Thus the degree
distribution and the cluster coefficient are used to define certain kinds of complex
networks. Another structural property, used to assess whether some graph should
be termed complex, is its assortativity or disassortativity, describing whether high
degree nodes are more often connected to high degree nodes or to low degree nodes.
These properties are not independent of each other; the degree distribution, for
instance, imposes constraints on the degree of assortativity. Therefore it would be
desirable to have a general framework that allows one to study these dependencies in a
systematic way and, in particular, to quantify structure and therefore complexity of networks.
Here we propose the theory of hierarchically structured exponential families [1,8], with
the help of which we can, starting from the Erdos-Renyi random graphs, incorporate
more and more interaction between parts of a network and provide a framework for
quantifying the degree of interaction.

In this theory "complexity" means statistical complexity [3]: In order to distinguish
between the structure and the random part, not only one object is considered, but a 
set of objects equipped with a probability distribution. "Random" then means statisti- 
cally independent. Accordingly, measures of statistical complexity quantify statistical 
dependencies in a distribution. They vanish in both cases of a totally ordered and a 
totally random system. 

Using the notion of "statistical complexity" to characterize single objects such as a 
given network is problematic in the following sense. If we speak about the complexity 
of a single network, we have to consider it as typical in an ensemble of networks. 
This assumption need not always be justified. If it is satisfied, however, we can use 
an ergodicity-type argument to approximate ensemble means by counts over a single 
typical instance in the ensemble. For example, the count of edges in one instance should 
provide an estimate of the edge probability in the ensemble. 

This paper is structured as follows. In Section 2 we describe the theoretical framework
of exponential families of random graphs. Building on this, Section 3 contains our main
results. We interpret common observables on graphs in our framework and shed new
light on their interpretation. Section 4 contains concrete examples of our models with
few parameters and discusses special features of sampling procedures. The discussion
in Section 5 is followed by an Appendix which contains technical details.

2 Basic setting — exponential families of random graphs 

We consider undirected random graphs without self connections, specified by an
adjacency matrix $A$. Denote by $N$ the number of nodes, $[N] := \{1, \dots, N\}$ the set of
nodes, and $E := \{(i,j) \in [N] \times [N] : i \neq j\}$ the set of off-diagonal indices of $A$. In a
given graph each edge is either present or not, therefore we are dealing with $|E|$ binary
random variables. Denote by $\mathcal{X} := \{0,1\}^{E}$ the set of states of this collection of random
variables. For any subset $B \subseteq E$ of potential edges we denote $\mathcal{X}_B := \{0,1\}^{B}$.
In this labeled setting the probability of a random graph is given by the probability of
its adjacency matrix $A = (a_e)_{e \in E}$. A random graph $G$ is described by binary random
variables with state space $\mathcal{X}$ and a probability

$$P(G) = P(a_{e_1}, \dots, a_{e_{N(N-1)}}) .$$

A distribution $P(G)$ is often called a graph ensemble in the following. We consider
so-called exponential families, classes of graph ensembles with a particularly nice
structure and interpretation. When used to describe probability distributions for random
graphs they have been termed "exponential random graphs", "p* models", or "logit
models" in the literature. For a recent overview see [11]. Here, we utilize families $\mathcal{E}_k$
that consist of the distributions with interactions between at most $k$ units. Let $f$ be a
function mapping states $(a_{e_1}, \dots, a_{e_{N(N-1)}})$ to the reals. With the usual addition and
multiplication by real numbers, these functions form a vector space

$$\mathbb{R}^{\mathcal{X}} := \{ f : \mathcal{X} \to \mathbb{R} \} .$$

Of course, any (real-valued) observable is such a function. Our systematic approach to 
quantifying structure consists in considering natural bases of this space. We can then 
express well known observables in terms of these bases, yielding a better understanding 
of the relation between observables. 
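The probability measures

$$\mathcal{P}(\mathcal{X}) := \Big\{ P \in \mathbb{R}^{\mathcal{X}} : P(x) > 0 \text{ for all } x \in \mathcal{X}, \ \sum_{x \in \mathcal{X}} P(x) = 1 \Big\}$$

form a subset of $\mathbb{R}^{\mathcal{X}}$, which has the geometry of a simplex. Its closure, where $P(x) = 0$
is allowed, is denoted by $\overline{\mathcal{P}(\mathcal{X})}$. The exponential map assigns to each function a strictly
positive probability measure:

$$\exp : \mathbb{R}^{\mathcal{X}} \to \mathcal{P}(\mathcal{X}), \qquad f \mapsto \frac{\exp(f)}{\sum_{x \in \mathcal{X}} \exp(f(x))} .$$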

Here, $\exp(f)$ is to be taken coordinatewise. Using the exponential map, there is a
natural way to define exponential families of probability measures by considering families
$\mathcal{E}$ that are exponential images of linear subspaces of $\mathbb{R}^{\mathcal{X}}$. One natural class of such
subspaces is given by limiting the interaction order. Following [2,8] one can define

$$\mathcal{I}_B := \{ f \in \mathbb{R}^{\mathcal{X}} : f(x_B, x_{E \setminus B}) = f(x_B, x'_{E \setminus B}) \text{ for all } x_B \in \mathcal{X}_B, \ x_{E \setminus B}, x'_{E \setminus B} \in \mathcal{X}_{E \setminus B} \}$$

as the space of functions depending only on the subset $B \subseteq E$ of their arguments. Then

$$\mathcal{I}_k = \sum_{|B| \le k} \mathcal{I}_B$$

is the space of functions depending on at most $k$ of their arguments. Here, the sum over
the $\mathcal{I}_B$ is to be understood as their span inside $\mathbb{R}^{\mathcal{X}}$. This definition leads to a hierarchy of
exponential families $\mathcal{E}_k := \exp(\mathcal{I}_k)$,

$$\mathcal{E}_1 \subseteq \dots \subseteq \mathcal{E}_{N(N-1)-1} \subseteq \mathcal{P}(\mathcal{X}) , \tag{1}$$

which is studied in information geometry. It allows one to model networks by considering
interactions of successively increasing order between their parts. It has been used to
quantify the amount and degree of interaction in dynamical systems in a systematic
fashion in [8].

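The notion of interaction order can be understood from the fact that any $P \in \mathcal{E}_k$ has
a (non-unique) representation as

$$P(G) = \prod_{B \subseteq E : |B| = k} \Psi_B(x_B) = \frac{1}{Z} \exp\Big( \sum_{B \subseteq E : |B| = k} \phi_B(x_B) \Big) .$$

Thus, $P \in \mathcal{E}_k$ means that

$$P(G) = \frac{1}{Z} e^{-\mathcal{H}(G)}, \qquad \mathcal{H}(G) = \sum_{e} c_e f_e(a_e) + \sum_{e_1, e_2} c_{e_1, e_2} f_{e_1, e_2}(a_{e_1}, a_{e_2}) + \dots + \sum_{e_1, e_2, \dots, e_k} c_{e_1, \dots, e_k} f_{e_1, \dots, e_k}(a_{e_1}, \dots, a_{e_k}) . \tag{2}$$
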

As mentioned above, any collection $O_1, O_2, \dots$ of observables defines a linear space as
their span

$$\mathcal{I} := \operatorname{span}\{O_1, O_2, \dots\} ,$$

which in turn defines an exponential family $\mathcal{E} := \exp(\mathcal{I})$. This exponential family
is the collection of maximum entropy distributions for fixed expectation values of the
observables. In particular, the mean values of the observables in a sample form a
sufficient statistic for the model $\mathcal{E}$. Given data, one can determine the mean values
of the observables on the data and then find a unique $P$ in the closure $\overline{\mathcal{E}}$ which has the
same statistics as the data and maximal entropy among all such distributions. Finding
this estimate can be computationally expensive for a general linear space. In
practice, an algorithm called iterative proportional fitting is used. It is implemented
in CIPI [12] and in statistical software packages such as loglin in R. Since
this method works directly on the vectors in $\mathbb{R}^{\mathcal{X}}$, it is limited to small $\mathcal{X}$. Fewer than
20 elements can be feasible. It is well known that for $\mathcal{E}_1$ the maximum entropy
distribution is just the product of the one-dimensional marginals of the data.
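The last statement can be illustrated with a minimal Python sketch. The toy data set `samples` and the helper name `product_of_marginals` are illustrative assumptions, not part of the paper; the sketch only builds the product of the one-dimensional marginals over a small binary state space and leaves iterative proportional fitting for higher orders aside.

```python
from itertools import product

def product_of_marginals(samples):
    """Build the product of one-dimensional marginals of binary data.

    Returns the marginal probabilities q[j] = P(x_j = 1) and the product
    distribution over all states in {0,1}^m.
    """
    m = len(samples[0])
    # Empirical probability that each binary unit equals 1.
    q = [sum(s[j] for s in samples) / len(samples) for j in range(m)]
    dist = {}
    for x in product((0, 1), repeat=m):
        prob = 1.0
        for j, xj in enumerate(x):
            prob *= q[j] if xj == 1 else 1.0 - q[j]
        dist[x] = prob
    return q, dist

# Hypothetical toy data: four observations of three binary units.
samples = [(1, 0, 0), (1, 1, 0), (0, 1, 0), (1, 1, 1)]
q, dist = product_of_marginals(samples)
```

By construction the resulting distribution reproduces the observed one-dimensional marginals, which are the sufficient statistics of the first-order family.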



2.1 Undirected Graphs and subgraph counts 

In the following we specialize the general theory to the case of undirected unlabeled
homogeneous graphs. Here $E = \{(i,j) \in [N] \times [N] : i < j\}$ is the set of potential edges,
resulting in a symmetric adjacency matrix $A = (a_{ij})_{i,j=1,\dots,N}$, by setting $a_{ij} = a_e$ if
$e = (i,j) \in E$ or $e = (j,i) \in E$, and $a_{ii} = 0$.

Unlabeled graphs are defined as equivalence classes and working with them in practice
becomes infeasible quickly. It is a curiosity of complexity theory that the graph
isomorphism problem is in the class NP, but is neither known to lie in P nor known to be
NP-complete. In any case, at the current time there is no fast algorithm to determine whether two
unlabeled graphs are isomorphic. Due to this unavoidable restriction we will always
work with adjacency matrices. In particular, the partition function is a sum over
symmetric adjacency matrices, and $P(H)$, for some unlabeled graph $H$, denotes the sum
of probabilities of adjacency matrices which have $H$ as their unlabeled graph. This is
complemented by the homogeneity that we assume in this setting. It means that we
consider observables that are only evaluated on a small part of the system, and the
value should show no systematic differences when varying the position. In particular,
the probability of finding an edge should be the same for any pair of vertices. In the
setting that was described in Section 2 this homogeneity is a certain symmetry
requirement. For instance, the Erdos-Renyi graphs emerge from the exponential family where
the linear space is the one-dimensional span of the edge count observable.
Generalizing from Erdos-Renyi graphs to higher dependency structures, natural
observables are the subgraph counts, defined as follows: Given a graph $G$ with potential
edge set $E$, we define the subgraph count of a subgraph $H$ as

$$n_H := \#\{\text{unlabeled subgraphs } H \text{ of } G\} . \tag{3}$$

For example, we denote by $n_{-}$ the number of edges and by $n_{\triangle}$ the number of triangles.
For any undirected graph $G$, the counts of subgraphs with at most $k$ edges form a
sufficient statistic for the exponential family $\mathcal{E}_k$ considered above. This can be seen as
follows. For each set $B \subseteq E$ of potential edges, we can define a function which takes
the value one precisely if all edges in $B$ are present, i.e. it counts the subgraph defined
by the labeled graph:

$$f_B(G) := \prod_{e \in B} a_e . \tag{4}$$

A classical argument from the theory of Markov random fields [13] shows that these
functions $\{f_B : B \subseteq E\}$ form a basis of the whole space $\mathbb{R}^{\mathcal{X}}$, while with $k = |B|$ we have
$f_B \in \mathcal{I}_k$. Uniqueness of the coefficients with respect to this basis depends on a choice
of a reference configuration, the so-called vacuum state. In our case the vacuum is the
empty graph. The statement about sufficiency follows if one observes that the
homogeneity requirement translates into a condition on the coefficients in (2): coefficients
$c_{e_1,\dots,e_l}$ for different $e_1,\dots,e_l$ representing the same undirected unlabeled subgraph
are required to be the same. Therefore the counts of unlabeled subgraphs with $k$ edges
(which are just sums over the $f_B$ for all $B$ representing a specific subgraph) span the
linear subspace of functions depending only on $k$ of their arguments, and taking
equal values whenever these $k$ arguments represent the same subgraph. Summarizing,
we have

$$\sum_{B \simeq H} f_B = n_H , \tag{5}$$

where the summation runs over all sets $B$ which define a subgraph isomorphic to an
unlabeled graph $H$.
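As a minimal sketch of relation (5) for the three smallest connected subgraphs, the following Python code sums indicator products over labeled edge sets. The function names and the adjacency-list-of-lists representation are assumptions made for illustration only.

```python
from itertools import combinations

def complete_graph(n):
    """Adjacency matrix (list of lists) of the fully connected graph on n nodes."""
    return [[1 if i != j else 0 for j in range(n)] for i in range(n)]

def subgraph_counts(adj):
    """Count edges, two-stars and triangles by summing indicator products f_B."""
    n = len(adj)
    edges = [(i, j) for i, j in combinations(range(n), 2) if adj[i][j]]
    n_edge = len(edges)
    # Two-stars: pairs of present edges that share exactly one node.
    n_two_star = sum(
        1 for e, f in combinations(edges, 2) if len(set(e) & set(f)) == 1
    )
    # Triangles: node triples whose three edges are all present.
    n_triangle = sum(
        adj[i][j] * adj[j][k] * adj[i][k]
        for i, j, k in combinations(range(n), 3)
    )
    return n_edge, n_two_star, n_triangle

counts_k4 = subgraph_counts(complete_graph(4))
counts_path = subgraph_counts([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
```

For the complete graph on four nodes this gives 6 edges, 12 two-stars and 4 triangles; for a path on three nodes it gives 2 edges, 1 two-star and no triangle.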

Note that fixing the number of vertices breaks the relation with the hierarchy in (1).
Consider as an example the full model with subgraph counts up to 4 vertices. The
energy has the form

$$\mathcal{H} = c_{-} n_{-} + c_{\llcorner} n_{\llcorner} + \dots , \tag{6}$$

with one term $c_H n_H$ for each subgraph $H$ on at most 4 vertices; here $n_{\llcorner}$ denotes the
number of two stars. This distribution is an element of $\mathcal{E}_6$, but not all elements of $\mathcal{E}_6$ are of this form, as we
have not used a subgraph count for 6 edges forming a chain, which would be a subgraph
on 7 vertices. It is also important to notice that changing one of the coefficients will
generally change all of the expected counts.

Apart from the subgraph counts, we often use the subgraph probabilities $p_H$, that is,
the probabilities for observing the subgraph $H$ when drawing a random graph from the
ensemble $P$ on randomly selected nodes. This can be written as an expectation value
with respect to the distribution $P$ as

$$p_H := \langle f_B \rangle, \quad B \simeq H , \tag{7}$$

where $B \subseteq E$ is any set of edges whose unlabeled graph is $H$. That any such $B$ can
be chosen is a consequence of the homogeneity that we require for our model. If the
subgraphs whose counts we use as observables are small enough (when compared to the
size of the network $G$), the homogeneity assumption allows one to use counts on a single
given network to estimate the ensemble expectation values. Quantities derived from a
single network are denoted by a hat as in $\hat{n}_H$.

Footnote: The graph corresponding to $B \subseteq E$ specifies the relations between the edges in $B$. As
an example consider $B_1 = \{(1,2), (1,6)\}$, where the two edges share vertex 1, compared to
$B_2 = \{(1,2), (4,7)\}$, where they are disconnected.

Footnote: We use the notation $\langle \cdot \rangle$ for expectation values with respect to the graph ensemble $P$.

When a subgraph probability $p_H$ is estimated from a single network, it is given by the
count of that subgraph, normalized by the maximum possible number of occurrences
of that subgraph. This in turn is just the number of occurrences of $H$ in the fully
connected network $\Gamma$:

$$\hat{p}_H = \frac{n_H(G)}{n_H(\Gamma)} . \tag{8}$$
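The normalization in (8) can be sketched for edges, two-stars and triangles as follows. The function name and the choice of exact fractions are illustrative assumptions; the normalizers are the counts of each subgraph in the complete graph on $N$ nodes.

```python
from fractions import Fraction
from math import comb

def subgraph_probabilities(n_nodes, n_edge, n_two_star, n_triangle):
    """Estimate subgraph probabilities from counts on a single graph.

    Each count is divided by the number of occurrences of the same
    subgraph in the fully connected graph on n_nodes nodes.
    """
    max_edge = comb(n_nodes, 2)                     # N(N-1)/2
    max_two_star = n_nodes * comb(n_nodes - 1, 2)   # N(N-1)(N-2)/2
    max_triangle = comb(n_nodes, 3)                 # N(N-1)(N-2)/6
    return (
        Fraction(n_edge, max_edge),
        Fraction(n_two_star, max_two_star),
        Fraction(n_triangle, max_triangle),
    )

# A cycle on four nodes has 4 edges, 4 two-stars and no triangle.
p_edge, p_two_star, p_triangle = subgraph_probabilities(4, 4, 4, 0)
```

For the four-cycle the estimates are 2/3 for the link density, 1/3 for two-stars and 0 for triangles.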



3 Measures of structural network properties 

3.1 Link density 

One of our main aims is to find good sets of network observables that capture the
important structural properties of a graph. Obviously the first property is the number
of links $n_{-}$ or the link density

$$\hat{p}_{-} = \frac{n_{-}}{n_{-}(\Gamma)} = \frac{2 n_{-}}{N(N-1)} , \tag{9}$$

with $\Gamma$ denoting the fully connected graph.

If only the expectation value of the link density or the number of links is specified,
the corresponding maximum entropy ensemble is the ensemble of Erdos-Renyi random
graphs or Bernoulli graphs. Its Hamiltonian is simply

$$\mathcal{H}(G) = c_{-} n_{-} = c_{-} \sum_{i<j} a_{ij} . \tag{11}$$

The main property of this ensemble is that there are no statistical dependencies between
the links. The degrees of the nodes are distributed according to a binomial distribution
fully determined by the mean node degree.
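The independence of the links can be checked by brute-force enumeration on a small graph. The sketch below assumes the sign convention $P(G) \propto \exp(-\mathcal{H}(G))$, in line with the energy interpretation used in Section 4; the function name and the parameter value are hypothetical.

```python
from itertools import product
from math import exp

def er_ensemble(num_edges, c):
    """Enumerate the ensemble with Hamiltonian H = c * (number of links).

    Assumes P(G) proportional to exp(-H(G)); states are tuples of
    edge indicators of length num_edges.
    """
    states = list(product((0, 1), repeat=num_edges))
    weights = [exp(-c * sum(x)) for x in states]
    z = sum(weights)  # partition function
    return {x: w / z for x, w in zip(states, weights)}

# Four nodes have six potential edges.
ens = er_ensemble(6, c=0.7)
p_edge = sum(p for x, p in ens.items() if x[0] == 1)
p_pair = sum(p for x, p in ens.items() if x[0] == 1 and x[1] == 1)
```

Under this convention each link is present with probability $1/(1 + e^{c})$, and the joint probability of two links factorizes into the product of the single-link probabilities.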



3.2 Degree distribution 

A degree distribution different from the binomial distribution introduces statistical
dependencies between the potential links. The resulting random graph ensemble will therefore
be different from the Bernoulli graphs. How does the degree enter our framework? In
a first step one might label the nodes using some labeling $\pi$, and assign to each node
an expected degree. This leads to the exponential random graph model

$$\mathcal{H}(G, \pi) = \sum_i c_i \sum_j a_{\pi(i)\pi(j)} , \tag{12}$$

with the $c_i$ determining the degree of node $\pi(i)$. In a second step one considers an
ensemble of ensembles of random graphs, where the different ensembles correspond to
different labellings, i.e. permutations of the node degrees. This leads to a probability
distribution that is a convex combination of distributions $P \in \mathcal{E}_1$, and thus generally
not contained in $\mathcal{E}_1$ but in a so-called mixture model. Moreover, this approach does
not lead to a parameterization of the form (2).

Our next aim is to understand how the degree distribution can be specified within the
exponential families $\mathcal{E}_k$. The key point is that the $k$-star counts allow one to determine
the moments of the degree distribution. (A $k$-star is a subgraph consisting of one central
node that is connected to $k$ other peripheral nodes, i.e. it contains $k$ links.) With
$d_i = \sum_j a_{ij}$ being the degree of node $i$,
we call $P(d)$ the probability that a randomly chosen node has degree $d$. We find

$$P(d) = \sum_G P(G) \frac{1}{N} \sum_{i=1}^{N} \delta_{d_i(G), d} . \tag{13}$$

The moments of the degree distribution are

$$\mu_k = \sum_d d^k P(d) . \tag{14}$$

For a given graph, we also have the empirical degree distribution

$$\hat{P}_G(d) = \frac{1}{N} \sum_{i=1}^{N} \delta_{d_i(G), d}$$

with the moments

$$\hat{\mu}_k = \frac{1}{N} \sum_{i=1}^{N} d_i^k . \tag{15}$$

Note that $\langle \hat{\mu}_k \rangle = \mu_k$. There is a direct relationship between the moments of the degree
distribution and the numbers of $k$-stars

$$n_k = \sum_{i=1}^{N} \binom{d_i}{k} . \tag{16}$$

Note that with this definition the number of 1-stars is two times the number of links,
$n_1 = 2 n_{-}$. Thus looking for the maximum entropy distribution for graphs with given
moments of the degree distribution up to order $k_{\max}$ corresponds to the exponential
random graph model having non-zero coefficients only for $k$-stars with $k \le k_{\max}$. In
particular, this distribution lies in $\mathcal{E}_{k_{\max}}$. The parameterization using the $k$-stars on
the one hand,

$$\mathcal{H} = \sum_{k=1}^{k_{\max}} c_k n_k , \tag{17}$$

and the moments of the empirical degree distribution of the graph on the other hand,

$$\mathcal{H} = \sum_{k=1}^{k_{\max}} c^{(\mu)}_k \hat{\mu}_k , \tag{18}$$

can be converted into each other using

$$n_k = \frac{1}{k!} \sum_{i=1}^{N} \sum_{m=1}^{k} s(k,m) d_i^m \tag{19}$$

$$\phantom{n_k} = \frac{N}{k!} \sum_{m=1}^{k} s(k,m) \hat{\mu}_m , \tag{20}$$

with $s(k,m)$ being the (signed) Stirling numbers of the first kind. The inverse relationship
between the $k$-star counts and the moments of the degree distribution can be expressed
using the Stirling numbers of the second kind $S(k,m)$:

$$\hat{\mu}_k = \sum_{m=1}^{k} S(k,m) \frac{m!}{N} n_m . \tag{21}$$



For the parameters this leads to the relationships

$$c_m = \sum_{k=m}^{k_{\max}} \frac{m!}{N} S(k,m) c^{(\mu)}_k \tag{22}$$

$$c^{(\mu)}_m = \sum_{k=m}^{k_{\max}} \frac{N}{k!} s(k,m) c_k . \tag{23}$$

The parameterization (18) might still not be the best way to explore different degree
distributions, because of the dependencies between the different moments. Instead of
using the empirical moments (15) as observables, one could think of observables that
can be independently varied more easily, such as the variance, skewness, and kurtosis.
Let us look in more detail at the variance; the other cases are similar. In the two star
model

$$\mathcal{H}(G) = c_{-} n_{-} + c_{\llcorner} n_{\llcorner} \tag{24}$$

$$\phantom{\mathcal{H}(G)} = c_1 n_1 + c_2 n_2 , \tag{25}$$

with $c_1 = c_{-}/2$ and $c_2 = c_{\llcorner}$, the resulting probability distribution can be equivalently
parameterized by the pairs $(c_1, c_2)$, $(\mu_1, \mu_2)$ or $(\mu_1, \mu_2 - \mu_1^2)$, the last being the mean and
the variance of the degree distribution. The variance of the empirical degree distribution
of a graph $G$ is

$$\operatorname{var}(\hat{P}_G(d)) = \hat{\mu}_2 - \hat{\mu}_1^2 . \tag{26}$$

One might think of a Hamiltonian of the form

$$\mathcal{H}(G) = c^{(\mu)}_1 \hat{\mu}_1 + c^{(\sigma)}_2 (\hat{\mu}_2 - \hat{\mu}_1^2) . \tag{27}$$

This model is different from the two star model because it involves a non-linear
transformation of the observables.
3.3 Cluster coefficient

The cluster coefficient can be defined as three times the ratio between the number of
triangles and the number of two stars in a given graph:

$$\hat{C} = \frac{3 n_{\triangle}}{n_{\llcorner}} = \frac{\hat{p}_{\triangle}}{\hat{p}_{\llcorner}} \tag{28}$$

with

$$\hat{p}_{\llcorner} = \frac{2 n_{\llcorner}}{N(N-1)(N-2)} , \qquad \hat{p}_{\triangle} = \frac{6 n_{\triangle}}{N(N-1)(N-2)} .$$

Thus the cluster coefficient for the ensemble $P$ measures the conditional probability
that, if for three randomly selected nodes one node is connected to the two others, these
are also connected:

$$C = p(a_{ij} = 1 \mid a_{ik} = 1, a_{jk} = 1) . \tag{32}$$

In the context of social networks this property is also called "transitivity" because it
means the probability that the friend of my friend is also my friend. If there are no
statistical dependencies between the links, we would expect

$$C_{\mathrm{ind}} := p_{-} . \tag{33}$$

Moreover, if there are statistical dependencies only between pairs of links ($P(G) \in \mathcal{E}_2$),
such as in the two star model, one might expect

$$p(a_{ij} = 1 \mid a_{ik} = 1, a_{jk} = 1) = p(a_{ij} = 1 \mid a_{ik} = 1) ,$$

or

$$\frac{p_{\triangle}}{p_{\llcorner}} = \frac{p_{\llcorner}}{p_{-}} , \tag{34}$$

respectively, and therefore the cluster coefficient would be equal to

$$\frac{p_{\llcorner}}{p_{-}} . \tag{35}$$

This is, however, not the correct expression for the two star model. Already the case
of only three nodes provides an example:

$$P(G) = P(a_{1,2}, a_{2,3}, a_{3,1}) = \frac{1}{Z} \exp(c_{-} n_{-} + c_{\llcorner} n_{\llcorner}) .$$

There we have

$$Z P(0,0,0) = 1$$

$$Z P(1,0,0) = Z P(0,1,0) = Z P(0,0,1) = h_1 = \exp(c_{-})$$

$$Z P(1,1,0) = Z P(1,0,1) = Z P(0,1,1) = h_2 = \exp(2 c_{-} + c_{\llcorner})$$

$$Z P(1,1,1) = h_3 = \exp(3 c_{-} + 3 c_{\llcorner}) = Z p_{\triangle}$$

$$Z = 1 + 3 h_1 + 3 h_2 + h_3 .$$

Hence, the cluster coefficient is

$$\frac{p_{\triangle}}{p_{\llcorner}} = \frac{P(1,1,1)}{P(1,1)} = \frac{P(1,1,1)}{P(1,1,1) + P(1,1,0)} = \frac{h_3}{h_2 + h_3} = \frac{1}{1 + h_2/h_3} .$$

On the other hand, (35) becomes

$$\frac{p_{\llcorner}}{p_{-}} = \frac{h_2 + h_3}{h_2 + h_3 + h_2 + h_1} = \frac{1}{1 + (h_1 + h_2)/(h_2 + h_3)} .$$

If the three random variables are only a subset of a larger set of random variables, as
in the case of larger networks, things become even more complicated.
Nevertheless, if $n_{-}$ and $n_{\llcorner}$ are sufficient statistics for the two star model, we should be
able to express the cluster coefficient by these two variables. In particular, we should
be able to express the expected number of triangles by the expected number of two
stars and the expected number of links.
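The three-node example can be verified by enumerating all eight states. The sketch below uses the weights $\exp(c_{-} n_{-} + c_{\llcorner} n_{\llcorner})$ exactly as written in the example above; the function name and the parameter values are hypothetical.

```python
from itertools import product
from math import exp

def three_node_two_star(c_edge, c_star):
    """Link, two-star and triangle probabilities of the three-node two star model."""
    weights = {}
    for x in product((0, 1), repeat=3):
        n_edge = sum(x)
        # On three nodes every pair of present links shares a node.
        n_star = n_edge * (n_edge - 1) // 2
        weights[x] = exp(c_edge * n_edge + c_star * n_star)
    z = sum(weights.values())
    prob = {x: w / z for x, w in weights.items()}
    p_edge = sum(p for x, p in prob.items() if x[0] == 1)
    p_star = sum(p for x, p in prob.items() if x[0] == 1 and x[1] == 1)
    p_tri = prob[(1, 1, 1)]
    return p_edge, p_star, p_tri

p_edge, p_star, p_tri = three_node_two_star(0.3, 0.5)
cluster = p_tri / p_star       # true cluster coefficient
naive = p_star / p_edge        # the pairwise guess
```

With a non-zero two-star coefficient the two ratios differ, whereas for a vanishing two-star coefficient (independent links) they coincide.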



3.4 Markov graphs 

If the Hamiltonian contains only the numbers of $k$-stars and triangles, it defines the
so-called Markov graphs [6]. This class of random graphs is well known in the social
network community. "Markov" here refers to the fact that in these graphs the occurrences
of links without a common node are statistically independent. The only subgraphs where
all pairs of links have a common node are the $k$-stars and the triangles. From what
we have discussed so far it becomes clear that these models can already account for a
large range of degree distributions, in contrast to the statement sometimes found in the
literature that the exponential random graph models of the social network community
only account for Poissonian degree distributions [4].



3.5 Assortativity 

Another widely studied property of a graph is its assortativity or disassortativity.
In an assortative graph, high degree nodes are prevalently connected to other high
degree nodes and low degree nodes to low degree ones. In disassortative graphs, high
degree nodes tend to be connected to low degree nodes. A simple way to measure this
property is the correlation coefficient between the remaining degrees of two connected
nodes [10]. "Remaining" degree refers to the degree of a node after subtracting one
for the link connecting this node to the other one. Empirical investigations showed
that most social networks are assortative, while the Internet or biological networks are
rather disassortative [10].

For an edge $a_{ij} = 1$ the remaining degrees at either side of the edge are given by

$$d_i^{(j)} = \sum_{k \neq j} a_{ik} = d_i - 1 ,$$

and analogously $d_j^{(i)} = d_j - 1$ for the other end.
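The assortativity is then given by the correlation coefficient between the remaining
degrees at either side of an edge:

$$r = \frac{\langle d_i^{(j)} d_j^{(i)} \rangle - \langle d_i^{(j)} \rangle \langle d_j^{(i)} \rangle}{\sqrt{\langle (d_i^{(j)})^2 \rangle - \langle d_i^{(j)} \rangle^2} \, \sqrt{\langle (d_j^{(i)})^2 \rangle - \langle d_j^{(i)} \rangle^2}} .$$

Using that we consider undirected graphs, i.e. $A$ is symmetric, $\langle (d_i^{(j)})^n \rangle = \langle (d_j^{(i)})^n \rangle$ for
$n = 1, 2, \dots$, and linearity of expectation values, this simplifies to

$$r = \frac{\langle d_i^{(j)} d_j^{(i)} \rangle - \langle d_i^{(j)} \rangle^2}{\langle (d_i^{(j)})^2 \rangle - \langle d_i^{(j)} \rangle^2} . \tag{36}$$

All relevant quantities can now be expressed in terms of subgraph counts (see appendix
for details). With $n_{\curlywedge}$ denoting the number of three-stars and $n_{\sqcup}$ the number of paths
with three links,

$$r = \frac{n_{-} (3 n_{\triangle} + n_{\sqcup}) - n_{\llcorner}^2}{n_{-} n_{\llcorner} + 3 n_{-} n_{\curlywedge} - n_{\llcorner}^2} . \tag{37}$$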

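In order to express the assortativity by the subgraph probabilities we list again all
subgraph probabilities including the missing ones:

$$\hat{p}_{-} = \frac{2 n_{-}}{N(N-1)} , \qquad \hat{p}_{\llcorner} = \frac{2 n_{\llcorner}}{N(N-1)(N-2)} , \qquad \hat{p}_{\triangle} = \frac{6 n_{\triangle}}{N(N-1)(N-2)} ,$$

$$\hat{p}_{\curlywedge} = \frac{6 n_{\curlywedge}}{N(N-1)(N-2)(N-3)} , \qquad \hat{p}_{\sqcup} = \frac{2 n_{\sqcup}}{N(N-1)(N-2)(N-3)} .$$

Thus

$$r = \frac{\dfrac{\hat{p}_{\triangle}}{\hat{p}_{\llcorner}} + (N-3) \dfrac{\hat{p}_{\sqcup}}{\hat{p}_{\llcorner}} - (N-2) \dfrac{\hat{p}_{\llcorner}}{\hat{p}_{-}}}{1 + (N-3) \dfrac{\hat{p}_{\curlywedge}}{\hat{p}_{\llcorner}} - (N-2) \dfrac{\hat{p}_{\llcorner}}{\hat{p}_{-}}} . \tag{43}$$

The assortativity coefficient is zero if

$$\frac{3 n_{\triangle}}{n_{\llcorner}} + \frac{n_{\sqcup}}{n_{\llcorner}} = \frac{n_{\llcorner}}{n_{-}} \tag{44}$$

or equivalently

$$\frac{\hat{p}_{\triangle}}{\hat{p}_{\llcorner}} + (N-3) \frac{\hat{p}_{\sqcup}}{\hat{p}_{\llcorner}} = (N-2) \frac{\hat{p}_{\llcorner}}{\hat{p}_{-}} , \tag{45}$$
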
which is a relation between conditional probabilities that is fulfilled in particular if 
Again, this does not mean that exponential random graphs with pairwise interactions
only, such as the two star model, have a vanishing assortativity coefficient $r$. The same
arguments as for the cluster coefficient apply. For Markov graphs, defined as random
graphs for which links without a common node occur statistically independently, we
can make an interesting observation: Condition (46) is fulfilled, and the assortativity
is fully controlled by the cluster coefficient.
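The expression of the assortativity through subgraph counts can be cross-checked against a direct correlation of remaining degrees over all links, each taken in both directions. The sketch below is an illustration under the count-based form derived in the appendix; the function names and the small test graph are hypothetical, and graphs in which all remaining degrees are equal (zero variance) are not handled.

```python
from itertools import combinations, permutations
from math import comb

def assortativity_from_counts(adj):
    """Assortativity expressed by counts of small subgraphs."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    n_edge = sum(deg) // 2
    n_two_star = sum(comb(d, 2) for d in deg)
    n_three_star = sum(comb(d, 3) for d in deg)
    n_triangle = sum(
        adj[i][j] * adj[j][k] * adj[i][k]
        for i, j, k in combinations(range(n), 3)
    )
    # Paths with three links on four distinct nodes; every path is
    # found once per direction, hence the division by two.
    n_path = sum(
        adj[a][b] * adj[b][c] * adj[c][d]
        for a, b, c, d in permutations(range(n), 4)
    ) // 2
    num = n_edge * (3 * n_triangle + n_path) - n_two_star ** 2
    den = n_edge * n_two_star + 3 * n_edge * n_three_star - n_two_star ** 2
    return num / den

def assortativity_direct(adj):
    """Correlation of remaining degrees over all links, both directions."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    pairs = [(deg[i] - 1, deg[j] - 1)
             for i in range(n) for j in range(n) if adj[i][j]]
    m = len(pairs)
    mean = sum(x for x, _ in pairs) / m
    cov = sum(x * y for x, y in pairs) / m - mean ** 2
    var = sum(x * x for x, _ in pairs) / m - mean ** 2
    return cov / var

def graph_from_edges(n, edges):
    """Symmetric adjacency matrix from a list of undirected edges."""
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj

# Hypothetical test graph: a triangle with a two-link tail.
g = graph_from_edges(5, [(0, 1), (1, 2), (0, 2), (0, 3), (3, 4)])
r_counts = assortativity_from_counts(g)
r_direct = assortativity_direct(g)
```

For this graph both computations agree and give a slightly negative value, $r = -1/9$.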



4 Network structure in simple exponential random graph models 

Let us consider an exponential random graph model

$$\mathcal{H}(G) = \sum_H c_H \hat{p}_H(G) , \tag{47}$$

where the summation runs over some set of subgraphs. If we fix the number of nodes,
then (47) defines an energy landscape in the space of all graphs with $N$ nodes. High
probability corresponds to low energy, therefore the minima of (47) should correspond
to the graphs that are most probable and therefore "typical" in the ensemble defined
by this model. A second possible reason for a graph being typical is a high number of
isomorphic graphs, but for sufficiently low temperatures this effect will be dominated
by the effect of the energy.* Because we expressed the energy using the subgraph
probabilities (8), it is obvious that the empty graph has zero energy and the energy of
the fully connected graph is equal to the sum of all coefficients, $\mathcal{H}(\Gamma) = \sum_H c_H$. Thus
we realize a first property of (47): If all coefficients $c_H$ are sufficiently negative, the
fully connected graph has minimal energy and is the most probable and only typical
graph. If, on the other hand, all coefficients are sufficiently positive, the empty graph
is the most probable and therefore typical graph. We conclude that in order to get
non-trivial typical graphs, at least one coefficient has to have a different sign than the
other coefficients. A more detailed analysis will show that additional requirements are
needed in order to get "interesting" typical graphs. In the following we discuss this for
some simple exponential random graph models, and shed some light on the difficulties
that were reported by several authors who tried to use them to describe real world
networks [7].



4.1 The two star model 

The two star model has the form

$$\mathcal{H} = c_{-} p_{-} + c_{\llcorner} p_{\llcorner} . \tag{48}$$

Fig. 1 shows the position of all 6-node graphs in the $(p_{-}, p_{\llcorner})$-plane. The convex hull of
these points defines all possible expectation values of $p_{-}$ and $p_{\llcorner}$ for two star models.
By linearity, the energy landscape is a plane in a third dimension and extreme values lie
on the boundary of the convex region. A positive value of $c_{-}$ and a negative value of $c_{\llcorner}$
result in the minimal energy graphs being located on the upper boundary of the region.
As visible in Fig. 1, the empty and the fully connected graphs are the only graphs lying
on this boundary. Therefore, for sufficiently low temperatures, the graph ensemble is
supported only on these two graphs. In the opposite case of negative $c_{-}$ and positive $c_{\llcorner}$
the minimal energy graphs lie on the lower boundary of the region. These graphs have
fewer two stars than the Erdos-Renyi graphs with the same link density. This is the only
structural property of graphs that can be quantified by the two star model (48). Fig. 2
gives an example of such a graph ensemble and shows its typical graphs corresponding
to the three most probable combinations of link and two star counts. The two most
probable positions A and B have a higher energy than the graphs of lowest energy at
C, but gain probability from their multiplicity (compare Fig. 1).

Fig. 1 Position of all graphs with N = 6 nodes in the $(p_{-}, p_{\llcorner})$-plane. On each point a bar is
drawn according to its multiplicity, i.e. the number of adjacency matrices with these link and
two star counts. The dashed line shows the position of Erdos-Renyi graphs.

* We did not introduce a temperature explicitly, but it can be done easily by setting
$\mathcal{H}(G) = E(G)/T$ with $E(G)$ being the energy and $T$ the temperature. Changing the
temperature corresponds to a rescaling of all coefficients $c_H$ in $\mathcal{H}(G)$ by a constant factor.
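The interplay between energy and multiplicity can be reproduced by enumerating all adjacency matrices on six nodes. This is a minimal sketch, assuming the convention $P(G) \propto \exp(-\mathcal{H}(G))$ with the energy (48) and the parameter values quoted in the caption of Fig. 2; the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations, product
from math import comb, exp

def two_star_ensemble(n_nodes, c_edge, c_star):
    """Enumerate all labeled graphs and group them by (links, two-stars).

    Returns multiplicities, energies and probabilities per group,
    assuming P proportional to multiplicity * exp(-energy).
    """
    pairs = list(combinations(range(n_nodes), 2))
    max_edges = len(pairs)
    max_stars = n_nodes * comb(n_nodes - 1, 2)
    multiplicity = Counter()
    for x in product((0, 1), repeat=max_edges):
        deg = [0] * n_nodes
        for (i, j), a in zip(pairs, x):
            if a:
                deg[i] += 1
                deg[j] += 1
        multiplicity[(sum(x), sum(comb(d, 2) for d in deg))] += 1
    # Energy in terms of the normalized counts (subgraph probabilities).
    energy = {
        key: c_edge * key[0] / max_edges + c_star * key[1] / max_stars
        for key in multiplicity
    }
    e_min = min(energy.values())  # shift for numerical stability
    weights = {k: multiplicity[k] * exp(-(energy[k] - e_min)) for k in multiplicity}
    z = sum(weights.values())
    prob = {k: w / z for k, w in weights.items()}
    return multiplicity, energy, prob

multiplicity, energy, prob = two_star_ensemble(6, c_edge=-80.0, c_star=120.0)
lowest = min(energy, key=energy.get)
most_probable = max(prob, key=prob.get)
```

Under these assumptions the lowest energy is attained by the 70 adjacency matrices of 2-regular graphs (6 links, 6 two-stars), while the most probable combination of counts lies elsewhere because of its larger multiplicity.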



4.2 Cluster coefficient and assortativity 

$$\mathcal{H} = c_{\triangle} p_{\triangle} + c_{\sqcup} p_{\sqcup} \tag{49}$$

Another simple exponential random graph model is given by (49). By the same
reasoning as above, $c_{\triangle}$ and $c_{\sqcup}$ should have opposite signs in order to have non-trivial typical
graphs. Fig. 3 shows again the region of admissible expectation values $p_{\triangle}$ and $p_{\sqcup}$, all
6-node graphs, and the line $p_{\sqcup} = p_{\triangle} = p_{-}^{3}$ of the Erdos-Renyi graphs. The minimal
energy graphs can be easily understood in this example. In the case of negative $c_{\triangle}$ and
positive $c_{\sqcup}$ triangles are preferred. The minimal energy graphs lie on the lower
boundary of the admissible region and are non-connected graphs with fully connected
components that could be considered as the ideal case of a "community structure" (see
for instance [9]). The size of the components depends on the concrete values of the
parameters. If the components are of different size, the graph is fully assortative, i.e.
$r = 1$. In the opposite case of positive $c_{\triangle}$ and negative $c_{\sqcup}$ triangles are suppressed, and
the minimal energy graph is a complete bipartite graph. The numbers of nodes in the
two subsets are equal if the total number of nodes is even. Fig. 3 shows this graph in
the case of 6 nodes. If the total number of nodes is odd, the numbers of nodes in the
two subsets will differ by one. As a consequence the minimal energy graph in this case
will be fully disassortative. This disassortativity is a consequence of the bipartiteness
and the different size of the components, thus not very informative on its own. The
same applies for the observed assortativity in the first case, which is also a consequence
of the very specific structure of these minimal energy graphs. At the moment it is not
clear to which extent it is possible and reasonable to explain assortativity and
disassortativity by a catalog of paradigmatic structures that correspond to minimal energy
graphs in exponential random graph models as in these simple examples.

Fig. 2 Graph ensemble over graphs of six nodes as specified by the two star model with
parameters $c_{-} = -80$ and $c_{\llcorner} = 120$. At the right typical graphs corresponding to the three
most probable combinations of link and two star counts are shown.

Fig. 3 Position of all graphs with N = 6 nodes in the $(p_{\sqcup}, p_{\triangle})$-plane. Graphs corresponding
to the extremal points of the convex hull are shown to the right.



5 Discussion 

We have presented a formalism that allows one to study and quantify systematically the
structures of networks as statistical dependencies. In particular, we showed how popular
measures of network structure such as the degree distribution, the cluster coefficient
and the assortativity coefficient can be expressed by subgraph probabilities. This
allows one to situate graph ensembles with predetermined values of these properties in the
elements of the hierarchy of exponential families (1), which illuminates both their
relationship and the extent to which they specify redundant information about the graph
structure. Very often only a single property is studied. For instance in [10], a random
graph model with given degree distribution and additionally given joint remaining
degree distribution for connected links is considered. This model allows for control of
the degree of assortativity, corresponding to a variation of $P(G)$ in one direction.
Depending on the exponential family chosen, there are many other directions with
non-vanishing assortativity. Thus it remains unclear how relevant this particular
direction is.

By identifying the subgraph counts as sufficient statistics for exponential random graph
models we also provide a link to systematically incorporate motif analysis in the
analysis of network structures. A more detailed analysis of this aspect is beyond the scope
of this paper and will be presented elsewhere.



A Expressing the assortativity coefficient by subgraph counts 

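Due to the homogeneity assumption all expectation values occurring in (36) can be estimated
as the average over all links in a given graph. Then the following relations to the subgraph
counts are derived:

$$\langle d_i^{(j)} \rangle \approx \frac{\sum_{i,j} a_{ij} d_i^{(j)}}{\sum_{i,j} a_{ij}} = \frac{2 n_{\llcorner}}{2 n_{-}} = \frac{n_{\llcorner}}{n_{-}}$$

Since the average is performed with respect to all links, nodes with high degree get high
weights. Thus, even though $d_i^{(j)} = d_i - 1$, the link average of $d_i^{(j)}$ is not equal to
$\frac{1}{N} \sum_i d_i - 1$.

$$\langle d_i^{(j)} d_j^{(i)} \rangle \approx \frac{\sum_{i,j} a_{ij} \sum_{k \neq j} a_{ik} \sum_{l \neq i} a_{jl}}{2 n_{-}} = \frac{6 n_{\triangle} + 2 n_{\sqcup}}{2 n_{-}} = \frac{3 n_{\triangle} + n_{\sqcup}}{n_{-}}$$

$$\langle (d_i^{(j)})^2 \rangle \approx \frac{\sum_{i, j \neq k} a_{ij} a_{ik} + \sum_{i, j \neq k \neq k'} a_{ij} a_{ik} a_{ik'}}{2 n_{-}} = \frac{2 n_{\llcorner} + 6 n_{\curlywedge}}{2 n_{-}} = \frac{n_{\llcorner} + 3 n_{\curlywedge}}{n_{-}}$$

Here $n_{\curlywedge}$ denotes the number of three-stars and $n_{\sqcup}$ the number of paths with three links.
Hence the assortativity expressed in subgraph counts is

$$r = \frac{\dfrac{3 n_{\triangle} + n_{\sqcup}}{n_{-}} - \dfrac{n_{\llcorner}^2}{n_{-}^2}}{\dfrac{n_{\llcorner} + 3 n_{\curlywedge}}{n_{-}} - \dfrac{n_{\llcorner}^2}{n_{-}^2}} = \frac{n_{-} (3 n_{\triangle} + n_{\sqcup}) - n_{\llcorner}^2}{n_{-} n_{\llcorner} + 3 n_{-} n_{\curlywedge} - n_{\llcorner}^2} = \frac{\dfrac{3 n_{\triangle}}{n_{\llcorner}} + \dfrac{n_{\sqcup}}{n_{\llcorner}} - \dfrac{n_{\llcorner}}{n_{-}}}{1 + \dfrac{3 n_{\curlywedge}}{n_{\llcorner}} - \dfrac{n_{\llcorner}}{n_{-}}} .$$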